The digital medial axis transform (MAT) represents an image subset S as the
union of maximal upright squares contained in S. Brute-force algorithms for
computing geometric properties of S from its MAT require time $O(n^2)$, where
n is the number of squares. Over the past few years, however, algorithms have
been developed that compute properties for a union of upright rectangles in
time $O(n \log n)$, which makes the use of the MAT much more attractive. We
review these algorithms and also present efficient algorithms for computing
union-of-rectangle representations of derived sets (union, intersection,
complement) and for conversion between the union of rectangles and other
representations of a subset.

1. Introduction

The medial axis transform (MAT) of a set S was first introduced by Blum
[1]. The MAT can be defined as the set of centers and radii of the maximal
disks that are contained in S. The “disks” can be of any desired standard
shape. For example, if S is a subset of a digital image, it is convenient to
use upright squares of odd side length as “disks”. The original set is just
the union of these locally maximal squares.

In general, the squares overlap and the number of squares, say n, in the
MAT of a region is large. If brute-force algorithms are used, using the MAT to
calculate geometric properties of a region requires $O(n^2)$ computation,
which can be quite costly. For this reason, it has been concluded [2] that the
MAT is not a very good region representation.

The purpose of this paper is to show that the MAT is not such an
unattractive region representation. In the geometric complexity literature
over the past few years, many algorithms have been published that compute
geometric properties for regions represented as unions of upright rectangles
in $O(n \log n)$ time. (An upright, “rectilinear”, or “iso-oriented” rectangle
is a rectangle whose sides are parallel to the coordinate axes.) Clearly, a
MAT is a special case of this representation where all the rectangles are
squares and have odd side lengths. We will review and discuss algorithms that
find the boundary and compute the perimeter, identify the connected
components, and find the area and other moments of a union of upright
rectangles.


Section  2  introduces  the  segment  tree  [3]  which  is  the  basic  data  structure 
used  in  the  algorithms.  Section  3  discusses  two  algorithms  [4,5]  that  find  the 
boundaries  and  perimeter  of  a  region.  In  Section  4,  we  discuss  the  connected 
component  algorithm  of  [6],  which  uses  a  priority  search  tree  [7].  We  also  present 
an  algorithm  to  solve  the  connected  component  problem  using  the  segment  tree 
instead  of  the  priority  search  tree.  Our  algorithm  has  the  same  time  and  space 
complexity  as  the  algorithm  in  [6].  Section  5  describes  an  algorithm  given  in  [9] 
to  find  the  area.  We  also  show  how  to  extend  the  algorithm  of  [7]  to  find  the 
centroid  and  other  moments  of  the  region. 

A  set  can  be  the  result  of  performing  set-theoretic  operations  on  given  sets. 
Section  6  presents  algorithms  to  compute  union  of  rectangle  representations  for 
such  derived  sets,  directly  from  the  representations  of  the  given  sets.  Conversion 
between  unions  of  rectangles  and  other  representations  is  discussed  in  Section  7. 

Another approach to speeding up computation based on the MAT is to augment
it with additional information. Ahuja and Hoff in [12] introduce an augmented
MAT, or AMAT; it makes use of a graph structure in which overlapping squares
are joined by arcs. This graph structure takes $O(n \log n)$ time to build.
They observe that if the average number of neighbors per square in the AMAT is
$a$, most $O(n^2)$ computations can be performed in $O(a n)$ time. In the
reported examples, $a$ varied from 8.3 to 21.9, but this was after pruning the
MAT to eliminate many redundant squares. The methods described in this paper
take $O(n \log n)$ time, but do not require storage of the graph structure.

2. Segment Trees

The segment tree, introduced in [3], is a useful data structure in many of the
algorithms which solve geometric problems involving a union of iso-oriented
rectangles using the line sweep method. A segment tree is a special binary
tree which allows fast insertions and deletions of line segments when the tree
represents line intervals.

Let $[a, b]$ be any interval with $b - a \ge 1$ and, for simplicity, let the
endpoints be integers. The segment tree $T(a, b)$ is defined as follows:
$T(a, b)$ has a root $v$, with $L(v) = a$ and $R(v) = b$, representing the
interval $[a, b]$. If $b - a > 1$ then $v$ has a leftson
$T(a, \lfloor (a+b)/2 \rfloor)$ and a rightson
$T(\lfloor (a+b)/2 \rfloor, b)$. If $b - a = 1$ then
leftson($v$) = rightson($v$) = null. An interval $[c, d] \subseteq [a, b]$ is
represented on $T(a, b)$ by a set of marked nodes consisting of the first node
$v$ on each of the paths from the root of $T(a, b)$ such that
$[L(v), R(v)] \subseteq [c, d]$, i.e.,
$[L(\mathrm{parent}(v)), R(\mathrm{parent}(v))] \not\subseteq [c, d]$. See
Figure 1 for an example. It is clear that for each $[c, d]$, the children of a
node cannot both be marked and at each level of the tree there are at most two
marked nodes. Since $T(a, b)$ has height $1 + \lceil \log_2 k \rceil$, where
$k = b - a + 1$, an interval $[c, d]$ is represented by
$\le 2(1 + \lceil \log_2 k \rceil) = O(\log_2 k)$ nodes. A set of $m$
intervals can be represented in $T(a, b)$ by appending to each tree node the
list of intervals which mark the node. Hence a segment tree $T(a, a+k-1)$ with
$m$ intervals needs $O(k + m \log_2 k)$ space and $O(k + m \log_2 k)$ time to
be built.

Figure 1: The segment tree T(1,8) of the interval [1,8]. The starred nodes
represent the interval [2,7] on T(1,8). The figure also marks an example of an
empty node, a full node, and a partial node.

To insert (or delete) an interval segment, find the corresponding marked
nodes and add (or delete) the segment from each node’s interval list. Since
every node’s interval list is no longer than $m$, insertion or deletion can be
done in $O(t \log_2 k)$ time if $t$ is the time required to insert (or delete)
an element from an interval list of size $\le m$. In general $t = \log_2 m$ if
a balanced binary search tree is used when $m$ is large. In some algorithms,
$t$ is a constant. For example, if one only needs to know how many segments
marked a node, then the interval list is simply an integer value which is
incremented or decremented. In this case the space needed to store $T$ is
linear.
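
The counting variant just described can be sketched as follows. This is a
minimal illustration, assuming integer endpoints and the child convention of
$T(a, b)$ given above; the class name `CountSegmentTree` and its methods are
hypothetical names introduced here, not taken from [3].

```python
class CountSegmentTree:
    """Segment tree T(a, b) that stores, per node, how many intervals mark it."""

    def __init__(self, a, b):
        assert b - a >= 1
        self.a, self.b = a, b
        self.count = {}  # (lo, hi) -> number of stored intervals marking the node

    def _update(self, lo, hi, c, d, delta, marked):
        if d <= lo or hi <= c:       # no overlap of positive length
            return
        if c <= lo and hi <= d:      # first node on this path inside [c, d]
            self.count[(lo, hi)] = self.count.get((lo, hi), 0) + delta
            marked.append((lo, hi))
            return
        mid = (lo + hi) // 2         # children are T(lo, mid) and T(mid, hi)
        self._update(lo, mid, c, d, delta, marked)
        self._update(mid, hi, c, d, delta, marked)

    def insert(self, c, d):
        """Increment the count of every node marked by [c, d]; return those nodes."""
        marked = []
        self._update(self.a, self.b, c, d, +1, marked)
        return marked

    def delete(self, c, d):
        """Decrement the count of every node marked by [c, d]; return those nodes."""
        marked = []
        self._update(self.a, self.b, c, d, -1, marked)
        return marked
```

Under this convention, inserting [2, 7] into T(1, 8) marks the nodes [2, 4],
[4, 6] and [6, 7]. Nodes are created lazily in a dictionary here only to keep
the sketch short; an array-based tree would serve equally well.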

Given $T(a, b)$ and an interval $[c, d] \subseteq [a, b]$, each node
$v \in T(a, b)$ is classified as being (i) empty if
$[c, d] \cap [L(v), R(v)]$ is empty, (ii) full if
$[L(v), R(v)] \subseteq [c, d]$ and (iii) partial, if it is neither empty nor
full. See Figure 1. Given a set of intervals, a node $v$ is empty (or full) if
and only if it is empty (or full) with respect to all the intervals.

All the line sweep algorithms in the following sections use the basic segment
tree or some variation of it to organize the necessary information, for easy
insertion and deletion of appropriate interval segments.


3.  Boundaries  and  Perimeters 


The union of a set of n rectangles consists of one or more connected regions.
Its contour consists of a collection of disjoint cycles, composed alternately
of vertical and horizontal edges, which specifies the outer boundaries as well
as the holes, if any, of the region. See Figure 2. The algorithms in this
section can be used to find the boundaries and perimeter of the region from
its medial axis transform.


Figure  2:  The  contour  is  the  set  of  two  cycles:  {abdefklmutsqa  ,  onhto  }. 

The algorithm in [4] determines the contour in two phases: the first finds all
the vertical edges and the second links them with horizontal edges. The first
phase uses a vertical scan line which sweeps from left to right. The
horizontal edges of the rectangles divide the vertical scan line into a number
of interval segments. If $I$ is the set of such vertical intervals just before
(or after) some vertical left (or right) edge $E$ of a rectangle, then the
contribution of $E$ to the contour of the region is $I' \cap E$, where $I'$ is
the complement of $I$. For example, in Figure 2, just before edge $ar$,
$I = \{st, tu\}$ and $I' \cap \{ar\} = \{aq\}$; just after edge $pm$,
$I = \{ar, mn\}$ and $I' \cap \{pm\} = \{on\}$. A segment tree is used to
determine $I' \cap E$. First, the y-coordinates of the horizontal edges are
sorted and mapped onto $\{1, \ldots, m\}$ ($m \le 2n$), where each horizontal
edge (determined by its y-coordinate) is associated with its position in the
sorted list. A segment tree $T(1, m)$ is built. When the vertical scan line
sweeps from left to right and a left edge is encountered, the y-coordinates of
its two endpoints are mapped into $\{1, \ldots, m\}$ and the segment is
inserted into $T(1, m)$. Similarly, a right edge is deleted from $T(1, m)$.
Note that at each node in $T$, only a count of the number of times it was
marked needs to be maintained. This value is increased or decreased as the
node is marked or unmarked. The union of the nodes with nonzero counts
represents the interval $I$. For each node $v$ in $T$, if $CONTR(v)$ denotes
$[L(v), R(v)] \cap I'$, then


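$$
CONTR(v) =
\begin{cases}
\emptyset & \text{if } v \text{ is full or there is an ancestor } u \text{ of } v \text{ such that } u \text{ is full,} \\
[L(v), R(v)] & \text{if } v \text{ is empty,} \\
CONTR(\mathrm{LEFTCHILD}(v)) \cup CONTR(\mathrm{RIGHTCHILD}(v)) & \text{if } v \text{ is partial.}
\end{cases}
$$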


Therefore $E \cap I' = \bigcup_{v \in S} CONTR(v)$, where $S$ is the set of
$O(\log_2 m)$ nodes representing the edge $E$ in $T$. The edge(s) obtained
from $E \cap I'$ are in general unions of segments. They must be processed,
i.e., contiguous intervals must be collated.
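
The computation of $E \cap I'$ can be sketched as below. This is a simplified
illustration under stated assumptions: nodes carry only a mark count, as in
the text, and the hypothetical helper `free_parts` descends to the leaves of
every uncovered subtree rather than pruning at empty nodes, so it does not
reproduce the running time of [4]. The names `mark` and `free_parts` are
introduced here for illustration only.

```python
def mark(count, lo, hi, c, d, delta):
    """Add delta to the count of every node of T(lo, hi) marked by [c, d]."""
    if d <= lo or hi <= c:
        return
    if c <= lo and hi <= d:
        count[(lo, hi)] = count.get((lo, hi), 0) + delta
        return
    mid = (lo + hi) // 2
    mark(count, lo, mid, c, d, delta)
    mark(count, mid, hi, c, d, delta)


def free_parts(count, lo, hi, c, d):
    """Return the parts of [c, d] within node [lo, hi] not covered by stored intervals."""
    if d <= lo or hi <= c:
        return []
    if count.get((lo, hi), 0) > 0:   # full node: contributes nothing
        return []
    if hi - lo == 1:                 # uncovered leaf: contributes itself
        return [(lo, hi)]
    mid = (lo + hi) // 2
    parts = free_parts(count, lo, mid, c, d) + free_parts(count, mid, hi, c, d)
    merged = []
    for s, e in parts:               # collate contiguous intervals
        if merged and merged[-1][1] == s:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged
```

With [2, 5] and [6, 7] stored in T(1, 8), the free parts of an edge spanning
[1, 8] are [1, 2], [5, 6] and [7, 8]. Because the recursion starts at the
root, a full ancestor suppresses everything below it, which is the first case
of the definition of CONTR.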

Once the vertical edges are found, the horizontal edges can easily be
determined. Basically the set of endpoints of the vertical edges is sorted in
ascending order of the y-coordinates, and then in ascending order of the
x-coordinates if two points have the same y-coordinate. In the sorted list
$(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$, we have $y_{2i-1} = y_{2i}$ for
$i \ge 1$. Then $(x_{2i-1}, y_{2i-1})$ and $(x_{2i}, y_{2i})$ are the
endpoints of a horizontal edge.
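
The pairing step above amounts to one sort followed by a scan. A minimal
sketch, assuming the endpoints are given as (x, y) tuples and that they come
from a valid contour (so their number is even); the function name
`horizontal_edges` is hypothetical.

```python
def horizontal_edges(endpoints):
    """Pair up endpoints of vertical contour edges into horizontal edges."""
    # Sort by y first, then by x among points sharing the same y.
    pts = sorted(endpoints, key=lambda p: (p[1], p[0]))
    # Consecutive points (2i-1, 2i) in the sorted list bound one horizontal edge.
    return [(pts[i], pts[i + 1]) for i in range(0, len(pts) - 1, 2)]
```

For a single unit square, whose vertical edges have endpoints (0, 0), (0, 1),
(1, 0) and (1, 1), this yields the bottom edge from (0, 0) to (1, 0) and the
top edge from (0, 1) to (1, 1).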

We can assign directions to the edges consistently so that we can determine
if a cycle represents the outer boundary or a hole by observing the direction
at some extreme (say, south-west) corner of a cycle. The direction of a
vertical edge is “up” if it arises from a left edge of a rectangle, and “down”
otherwise. This can easily be done when the edge is identified. In phase two,
if at each endpoint $(x, y)$ of a vertical edge, the direction $d$ of the
vertical edge and the y-coordinate $y'$ of the other end of the vertical edge
are also recorded, then the horizontal edge
$((x_{2i-1}, y_{2i-1}), (x_{2i}, y_{2i}))$, i.e., $y_{2i-1} = y_{2i}$, has
direction (from left to) right if $d_{2i-1}$ is down and
$y_{2i-1} < y'_{2i-1}$, or if $d_{2i-1}$ is up and $y_{2i-1} > y'_{2i-1}$.
Otherwise the horizontal edge has direction (from right to) left.

If the contour has $p$ edges, the above algorithm uses
$O(n \log n + p \log(2n^2/p))$ time and $O(n + p)$ space.

In [5], an additional data structure called the contracted segment tree is
used to improve the time complexity to $O(n \log n + p)$. To find the vertical
edges, one needs to compute the contribution of a rectangle edge $e$. First
the segment tree nodes which represent the y-interval of $e$ are located. But
if some of these nodes or their ancestors are full with respect to the
segments already in the tree, their contribution to $E \cap I'$ is empty. Thus
we need to find only those parts of the segment tree which are “free” with
respect to the segment being added. This is done in [4] by traversing the
subtree of the relevant nodes, and not reporting the full nodes. In [5], the
“free” parts in the subtree of a node are attached to the node as another
segment tree, the contracted segment tree, which stores only the “gaps” in the
subtree rooted at that node. This allows us to report the gaps in $O(\alpha)$
time, where $\alpha$ is the number of gaps in the subtree at the given node,
and thus yields an optimal time algorithm.

Clearly, the contour algorithm can be easily modified and simplified to find
the perimeter, which equals the total length of the borders of the regions. It
also gives us the number of connected components in the union (just count the
number of outer borders).


7.  Conversion  between  representations 

A subset can be specified using various representations other than a union of
rectangles. For example, it can be specified by the boundaries of its regions,
by its run length code, or by its quadtree [2]. This section discusses
conversion of the union-of-rectangles representation to and from the boundary
and run length representations. Given a quadtree, its set of (black) leaf
nodes is a (non-minimal) union-of-rectangles representation. Conversely, given
even a single rectangle, depending on its position, the corresponding quadtree
can have O(image diameter) leaves. We will not discuss conversion between
quadtrees and unions of rectangles in detail here.

7.1.  Boundaries 

The  algorithm  in  Section  3  finds  the  boundaries  of  a  subset  from  its 
representation  by  a  set  of  rectangles.  The  outer  boundaries,  given  by  the  (x ,  y  ) 
coordinates  of  the  vertices,  are  specified  in  one  direction  (clockwise),  and  the 
holes  are  specified  in  the  opposite  direction  (counterclockwise). 

Suppose the corners (vertices) of the contours (boundaries) of a subset are
given using the above convention. We first consider the following simple
algorithm to find a set of rectangles whose union is the subset.

1. The vertices of the boundary are sorted in increasing order. Let
$u_1 < u_2 < \cdots < u_m$ be the distinct x-coordinates. We will use a
vertical sweepline which stops at each $u_s$ ($1 \le s \le m$).

Figure 8: Regions $S_1 = \bigcup_{i=1}^{n_1} P_i$,
$S_2 = \bigcup_{j=1}^{n_2} Q_j$. The $P_i$'s are disjoint, the $Q_j$'s are
disjoint, and $S_1 \cap S_2$ is the union of $n_1 n_2$ disjoint rectangles.


rectangles whose union is the complement. In fact, the sorting step in the
algorithm need not be performed since it can be done in stage 1. The time
complexity of stage 2 is thus $O(p \log h)$, where $h$ is the number of
horizontal edges a vertical sweep line can cross. The complement can be found
in $O(n \log n + p \log h)$ time, where $p \le n^2$ and $h \le n$.


by intersecting each $P_i$ in $S_1$ with each $Q_j$ in $S_2$. This takes
$O(n_1 n_2)$ time and the intersection has a maximum of $n_1 n_2$ rectangles.
Figure 8 shows the case where $S_1 \cap S_2$ has $n_1 n_2$ disjoint
rectangles, and each $P_i$ intersects every $Q_j$; in this case it takes
$O(n_1 n_2)$ time to find the intersection. One can improve the efficiency by
first sorting the rectangle vertices in increasing $(x, y)$ order: sorted list
$L_1$ for the $P_i$'s and sorted list $L_2$ for the $Q_j$'s. Now one can go
down list $L_1$ and, for each rectangle $P_i$, one only needs to intersect it
with those $Q_j$ which fall in the range of $P_i$. Of course, in the worst
case (as the example in Figure 8 shows), this still takes $O(n_1 n_2)$ time.
In fact any algorithm would need to take at least $O(n_1 n_2)$ time for the
sets shown in Figure 8.

We find that the segment tree is not particularly useful for this problem. In
order to produce the rectangles in the intersection, one needs to know not
only the active segments at each stopping position of the scan line, but also
the x-coordinate of the left edge which makes the segment active. Moreover, to
search for the $Q_j$'s which intersect a vertical edge of $P_i$ may entail
searching the entire subtree of the node corresponding to that vertical
segment. Again, the sets shown in Figure 8 would cause this to happen.

The complement of a set of n rectangles with respect to an enclosing outer
rectangle can be found in two stages. First we can use the $O(n \log n + p)$
algorithm in Section 3 to find the contour of the region; $p$ is the number of
edges in the contour. The contour of the complement is this contour with all
edge directions reversed, together with the outer rectangle. Then we can use
the boundary to union-of-rectangles conversion algorithm in Section 7.1 to get
a set of


8.  MATs  of  derived  sets 


The  algorithms  in  the  previous  sections  show  that  geometric  properties  can 
be  computed  from  MATs  quite  efficiently.  This  section  discusses  algorithms  for 
obtaining  MAT  representations  for  subsets  derived  from  given  subsets  by  set- 
theoretic  operations  such  as  union,  intersection,  complement  and  windowing, 
where  the  given  subsets  are  represented  by  MATs.  Since  it  is  well-known  that 
the  problem  of  finding  minimal  rectangle  covers  for  polygons  is  NP-hard  [10],  the 
problem  of  minimizing  the  numbers  of  rectangles  in  the  MAT  representations  of 
the  derived  sets  is  not  discussed. 

First we consider the problem of windowing. Given a region represented by
a set of n upright rectangles and a rectangular window, it is easy to find
that part of the region which is inside the window. Since the intersection of
two upright rectangles is either empty or another upright rectangle, one can
find the intersection of the window with each of the n given rectangles and
report all of the non-empty intersections. Thus windowing is a linear-time
operation.
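
Windowing as described above can be sketched in a few lines. The
representation of a rectangle as a tuple (x1, y1, x2, y2) with x1 < x2 and
y1 < y2, and the function names `intersect` and `window`, are assumptions made
for this illustration.

```python
def intersect(r, s):
    """Intersection of two upright rectangles, or None if it has no area."""
    x1, y1 = max(r[0], s[0]), max(r[1], s[1])
    x2, y2 = min(r[2], s[2]), min(r[3], s[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None


def window(rects, win):
    """Clip every rectangle to the window and keep the non-empty results."""
    clipped = (intersect(r, win) for r in rects)
    return [c for c in clipped if c is not None]
```

Each rectangle is examined once, so the running time is linear in the number
of rectangles. The same `intersect` helper applied to every pair $(P_i, Q_j)$
gives the brute-force intersection of two regions.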

Given two sets of rectangles, each representing a region, the set union of
these two sets represents the union of the two regions. Of course, the
resulting set of rectangles is in general not minimal. Note that in the case
where the two given regions are disjoint, the union of the sets of rectangles
is the best one can do.

The intersection of two regions $S_1 = \bigcup_{i=1}^{n_1} P_i$ and
$S_2 = \bigcup_{j=1}^{n_2} Q_j$, where $P_i$, $Q_j$ are upright rectangles,
$S_1 \cap S_2 = \bigcup_{i=1}^{n_1} \bigcup_{j=1}^{n_2} (P_i \cap Q_j)$, can
be found

$$
\iint_A x^k \, dx \, dy = \sum_{1 \le i < 8} \frac{u_{i+1}^{k+1} - u_i^{k+1}}{k+1} \, M(i),
$$

where $M(i)$ is the one-dimensional measure of the active region between
$u_i$ and $u_{i+1}$. Since $M(i)$ is obtained as before and the $u_i$ are
known, we can evaluate the required integral. The integral
$\iint_A y^k \, dx \, dy$ can be evaluated by rotating the figure by 90° and
evaluating $\iint_A x^k \, dx \, dy$. The integral $\iint_A x \, dx \, dy$ can
be evaluated concurrently with $\iint_A dx \, dy$, and thus the moments can be
obtained in the same time complexity as the area.
known, and the $M(i)$ are evaluated as described above. Thus,
$\iint_A x \, dx \, dy$ can be determined in time $O(n \log n)$.
$\iint_A dx \, dy$ is the area of the region. $\bar{y}$ may be obtained in the
same way, rotating the axes by 90°.
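
The centroid computation can be illustrated with the sketch below. It is
written for clarity rather than speed: $M(i)$ is obtained by merging the
y-intervals of the rectangles that span each slab, which is a brute-force
substitute for the segment tree and does not achieve the $O(n \log n)$ bound.
It also accumulates the y-moment directly from the merged intervals instead of
rotating the axes. Integer coordinates, a union of non-zero area, the tuple
layout (x1, y1, x2, y2), and the function name `centroid` are assumptions of
this example.

```python
from fractions import Fraction


def centroid(rects):
    """Centroid of a union of upright rectangles, assuming uniform density."""
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    area = mx = my = Fraction(0)
    for u, u_next in zip(xs, xs[1:]):
        # y-intervals of the rectangles that span the slab [u, u_next]
        spans = sorted((r[1], r[3]) for r in rects if r[0] <= u and u_next <= r[2])
        merged = []
        for lo, hi in spans:
            if merged and lo <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], hi)
            else:
                merged.append([lo, hi])
        m = sum(hi - lo for lo, hi in merged)            # M(i)
        area += m * (u_next - u)
        mx += Fraction(u_next**2 - u**2, 2) * m          # (1/2)(u_{i+1}^2 - u_i^2) M(i)
        my += (u_next - u) * sum(Fraction(hi**2 - lo**2, 2) for lo, hi in merged)
    return mx / area, my / area
```

For the L-shaped union of (0, 0, 2, 1) and (0, 0, 1, 2), which has area 3, the
result is (5/6, 5/6).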

The moment of inertia about the origin is given by

$$
\iint_A (x^2 + y^2) \, dx \, dy = \iint_A x^2 \, dx \, dy + \iint_A y^2 \, dx \, dy.
$$

In general, the problem of evaluating the moments is one of evaluating
integrals of the form $\iint_A x^k \, dx \, dy$ and $\iint_A y^k \, dx \, dy$.
For example, if $A$ is the region defined by the union of the rectangles in
Figure 7, then


Figure 7. Illustration of moment computation (scan-line stops
$u_1, \ldots, u_8$ marked along the x-axis).


When inserting an interval segment into the tree, instead of simply marking
the nodes as in Section 2, we mark all the nodes of the 1-umbrella of that
segment. At each node of the segment tree, the following two values are
maintained: (1) a count of the number of times it figures as a full node of
some umbrella, and (2) the total length of the fragments in its subtree which
are covered by 1-umbrellas through it or below it. The value of the second
field at the root gives $M_i$ at scan position $u_i$. Deletion of a segment
updates the above values too. We need the count field (1) because a given node
can belong to several umbrellas. Since each partial sum can be found in
$O(\log n)$ time, the area can be determined in $O(n \log n)$ time and the
segment tree only needs $O(n)$ space for n rectangles.
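
As an illustration of the two per-node fields, the following sketch computes
the area of a union of rectangles by a sweep. It is not the 1-umbrella
bookkeeping of [9]: it uses the common simplification in which each node
stores a mark count and the covered length of its subtree, recomputed on the
way back up from each update. The tuple layout (x1, y1, x2, y2) and the
function name `union_area` are assumptions made for this example.

```python
def union_area(rects):
    """Area of a union of upright rectangles given as (x1, y1, x2, y2)."""
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    if len(ys) < 2:
        return 0
    index = {y: i for i, y in enumerate(ys)}
    n = len(ys) - 1                  # number of elementary y-intervals
    count = [0] * (4 * n)            # field (1): times the node is fully marked
    covered = [0] * (4 * n)          # field (2): covered length in the subtree

    def update(node, lo, hi, c, d, delta):
        if d <= lo or hi <= c:
            return
        if c <= lo and hi <= d:
            count[node] += delta
        else:
            mid = (lo + hi) // 2
            update(2 * node, lo, mid, c, d, delta)
            update(2 * node + 1, mid, hi, c, d, delta)
        if count[node] > 0:
            covered[node] = ys[hi] - ys[lo]
        elif hi - lo == 1:
            covered[node] = 0
        else:
            covered[node] = covered[2 * node] + covered[2 * node + 1]

    events = []
    for x1, y1, x2, y2 in rects:
        if x1 < x2 and y1 < y2:
            events.append((x1, +1, index[y1], index[y2]))   # left edge: insert
            events.append((x2, -1, index[y1], index[y2]))   # right edge: delete
    events.sort()

    area, prev_x = 0, None
    for x, delta, c, d in events:
        if prev_x is not None:
            area += covered[1] * (x - prev_x)   # M_i * (u_{i+1} - u_i)
        update(1, 0, n, c, d, delta)
        prev_x = x
    return area
```

For the two overlapping squares (0, 0, 2, 2) and (1, 1, 3, 3) the function
returns 7. Each update touches $O(\log n)$ nodes, and the covered length at
the root plays the role of $M_i$.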

In the rest of this section, we use the segment tree with 1-umbrellas to
determine the moments of the region.

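The centroid (or center of gravity) $(\bar{x}, \bar{y})$ of a region $A$ is
given by

$$
\bar{x} = \frac{\iint_A x \, P(x, y) \, dx \, dy}{\iint_A P(x, y) \, dx \, dy},
\qquad
\bar{y} = \frac{\iint_A y \, P(x, y) \, dx \, dy}{\iint_A P(x, y) \, dx \, dy},
$$

where $P(x, y)$ is the density at point $(x, y)$. If we assume uniform density
throughout, then

$$
\bar{x} = \frac{\iint_A x \, dx \, dy}{\iint_A dx \, dy},
\qquad
\bar{y} = \frac{\iint_A y \, dx \, dy}{\iint_A dx \, dy}.
$$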

We can find these incrementally at each stop position $u_i$ of the vertical
scan line by observing that
$\iint_A x \, dx \, dy = \sum_i \frac{1}{2} (u_{i+1}^2 - u_i^2) \, M(i)$.
The $u_i$ are

tained in the interval $t$ represents, and $q$ is not contained in $t$'s
leftchild or rightchild. In other words, $[v_i, v_j]$ is split at $t$, where
the beginning part of it goes to $t$'s left subtree and the rest goes to
$t$'s right subtree. Let Ltip, Rtip be the leftmost (rightmost) marked node
for $q$ in the left (right) subtree of $t$, i.e., $L(\mathrm{Ltip}) = v_i$,
$R(\mathrm{Rtip}) = v_j$. Ltip, Rtip are full with respect to $q$.

The 1-umbrella of a segment $[v_i, v_j]$ consists of the node $t$ as defined
above, all nodes along the path from $t$ to Ltip and their rightchildren (if
any), and all nodes along the path from $t$ to Rtip together with their
leftchildren (if any). See Figure 6. Since a 1-umbrella has at most four nodes
at each level, it has at most $O(\log n)$ nodes for any segment and it can be
built in $O(\log n)$ steps.


Figure 6. A 1-umbrella of a segment $[v_i, v_j]$. Here
$[v_i, v_j] \subseteq [L(t), R(t)]$ but each child of $t$ contains a portion
of $[v_i, v_j]$.


5.  Area  and  moments 

The area (or measure) of a set of n rectilinear rectangles is the area covered
by (= the number of pixels in) their union. The area together with the
perimeter (see Section 3) gives us information about the compactness of the
region. The measure problem was first solved in 1 and 2 dimensions in [3], and
a generalized solution in d dimensions was presented in [9].

The algorithm to find the area in [9] uses the sweep line method and a
version of the segment tree. First the x-coordinates of the vertical edges are
sorted to get the list $u_1, u_2, \ldots, u_k$ ($k \le 2n$). The vertical scan
line will be positioned at each of the $u_i$'s. Let $M_i$ be the length of the
active interval segments when the scan line is at $u_i$; then the area of the
region is $M = \sum_{1 \le i < k} M_i (u_{i+1} - u_i)$. Thus, to determine the
total area, we need to accumulate the partial areas. For this we need to
determine the length of the active segments at each stopping position of the
scan line. The algorithm will calculate $M_{i+1}$ by applying a correction to
$M_i$.

The y-coordinates of the horizontal lines are sorted, denoted by
$\{v_1, v_2, \ldots, v_l\}$, $l \le 2n$. The segment tree $T(1, l)$ is built.
At each scan position $u_i$, $1 \le i \le k$, the active segments are marked
on the segment tree together with some information described below.

Let $q = [v_i, v_j]$, $v_i < v_j$, be a vertical interval segment. In finding
and marking the tree nodes representing $[v_i, v_j]$ (see Section 2), starting
from the root node, let $t$ be the node in $T(1, l)$ such that
$[v_i, v_j] \subseteq [L(t), R(t)]$, i.e., $q$ is con-


Once  we  obtain  the  list  of  all  such  pairs,  we  collect  pairs  which  have  at  least 
one  component  in  common,  to  form  the  connected  components. 
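
Grouping the reported pairs into components can be sketched with a union-find
structure. This is a substitute for the graph traversal mentioned in the text,
chosen here for brevity; it is an assumption of this example and not
necessarily the method of [6]. Rectangles are assumed to be numbered
0, ..., n-1, and the function name `components` is hypothetical.

```python
def components(n, pairs):
    """Group rectangles 0..n-1 into connected components from a list of pairs."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb                 # merge the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For five rectangles with reported pairs (0, 1), (3, 4) and (1, 2), the
components are [0, 1, 2] and [3, 4].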

This algorithm, presented in [6], determines the connected components of n
rectilinear rectangles in $O(n \log n)$ time and $O(n)$ space.

Instead of using a priority search tree as in [6], we can use a segment tree
to keep track of the active intervals. Specifically, the y-coordinates of the
horizontal edges are sorted and mapped to $\{1, \ldots, k\}$ ($k \le 2n$). A
segment tree $T(1, k)$ is built. At each node we keep a count of the interval
segments which marked the node or one of its descendants (subintervals). This
is similar to the tree used in Section 3 except that here the count represents
partial instead of full nodes. When a left-side (right-side) vertical segment
is inserted (deleted), the count of all the nodes along the path from the root
to the marked nodes is increased (decreased) by 1. Thus insertion and deletion
can be done in $O(\log n)$ time. To test if a given interval intersects any of
the intervals in the segment tree, we locate the tree nodes $v$ representing
the test interval, i.e., $[L(v), R(v)] \subseteq$ test interval. If any of the
nodes has a count > 0, then it has a subinterval that intersects the test
interval. Again this can be done in $O(\log n)$ time. Hence using the segment
tree we can achieve the same time ($O(n \log n)$) and space ($O(n)$)
complexity as the algorithm in [6].
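
One way to realize this intersection test is sketched below. It keeps two
counts per node, which is an interpretation made for this example rather than
a detail stated in the text: `mark` counts the stored intervals that mark the
node itself, and `below` counts those with a marked node anywhere in its
subtree. The separate `mark` count lets the query also detect a stored
interval that marks an ancestor of a test node. Intervals that share only an
endpoint are not reported as intersecting; the class and method names are
hypothetical.

```python
class ActiveIntervals:
    """Segment tree over T(a, b) answering 'does [c, d] meet a stored interval?'"""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.mark = {}    # node -> stored intervals that mark this node
        self.below = {}   # node -> stored intervals with a marked node in its subtree

    def _update(self, lo, hi, c, d, delta):
        if d <= lo or hi <= c:
            return
        self.below[(lo, hi)] = self.below.get((lo, hi), 0) + delta
        if c <= lo and hi <= d:
            self.mark[(lo, hi)] = self.mark.get((lo, hi), 0) + delta
            return
        mid = (lo + hi) // 2
        self._update(lo, mid, c, d, delta)
        self._update(mid, hi, c, d, delta)

    def insert(self, c, d):
        self._update(self.a, self.b, c, d, +1)

    def delete(self, c, d):
        self._update(self.a, self.b, c, d, -1)

    def intersects(self, c, d, lo=None, hi=None):
        if lo is None:
            lo, hi = self.a, self.b
        if d <= lo or hi <= c:
            return False
        if self.mark.get((lo, hi), 0) > 0:      # a stored interval covers this node
            return True
        if c <= lo and hi <= d:                 # node represents part of the test interval
            return self.below.get((lo, hi), 0) > 0
        mid = (lo + hi) // 2
        return self.intersects(c, d, lo, mid) or self.intersects(c, d, mid, hi)
```

With [2, 4] stored in T(1, 8), the test interval [3, 6] is reported as
intersecting, while [4, 6] is not.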


As the vertical sweep line moves from left to right passing through the
rectangles, both the priority search tree and the illuminator tree are updated
as required. When a new rectangle A starts in the illuminator tree we may have
to coalesce a collection of intervals lying between the endpoints. For each
interval B being coalesced, if B intersects any of the current priority search
tree intervals, we output the pair (A, B). For example, consider Figure 5.


Figure 5. Example of tree updating.


Just before the sweep line reaches edge ah, the priority search tree contains
intervals bd and eg while the illuminator tree has segments ab, bc, cf, fg,
gh. On meeting ah, we have to merge intervals ab, bc, cf, fg and gh to obtain
the new illuminator ah. Of the segments being coalesced, only bc, cf and fg
share points with bd and eg. Thus, we report pairs (A, D), (B, D) and (C, D).


new rectangle starts in the illuminator tree, we must do the following: (1)
Locate the two endpoints of the left side of the rectangle in the current set
of intervals and split the intervals containing the endpoints. This can be
done in $O(\log n)$ time. (2) Merge all t intervals lying between the two
endpoints, because in these parts, the new rectangle becomes the closest
rectangle to its left. This can be done in $O(t)$ time.


Figure 4. $S_0, S_1, \ldots, S_6, S_7$ are the segments in the illuminator
tree at the position of the sweep line. In parentheses are the rectangles
that illuminate each of the segments.


"dynamic" priority search tree. The priority search tree uses $O(n)$ space to
represent n intervals using a balancing scheme as is done in working with AVL
trees [7]. Insertion and deletion of intervals can be done in $O(\log k)$
time, and we can decide if a given interval intersects any of the intervals
represented on the tree in $O(\log k)$ time, where $k \le n$ is the number of
nodes in the tree. In the algorithm to find the connected components, the
priority search tree is used to represent the vertical intervals which are the
intersections of the current sweep line with the active rectangles.


Figure  3.  A  priority  search  tree  representing  the  intervals 
(1, 4),  (2, 4),  (4, 7),  (0,7),  (3, 5). 

The illuminator tree represents the vertical intervals which form a partition
of the sweep line, such that each interval is the projection of the left side of the
nearest rectangle to the left of the sweep line. See Figure 4. The illuminator tree
can be implemented as a balanced 2-3 tree [8] using O(n) space. Each node in
the tree represents, say, the bottom endpoint of the corresponding segment.
When a rectangle ends, there is no change in the illuminator tree. But when a

4. Connected components


Given a set of n rectilinear rectangles, the connected components of the
region covered by the rectangles can be determined in time O(n log n) and
space O(n) [6]. The connected components are specified by lists of rectangles
where the rectangles in each list belong to the same connected region. The
algorithm in [6] uses the line sweeping method and it works in two phases. The first
phase produces a list of pairs of rectangles which belong to the same connected
component. The second phase traverses the graph defined by this pair list to obtain
a list of rectangles for each component.
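
The second phase is an ordinary graph traversal. A minimal sketch in Python, assuming the rectangles are numbered 0 to n-1 and the first phase has already produced the pair list (the function name and the input format are our own choices for illustration, not taken from [6]):

```python
from collections import defaultdict


def components_from_pairs(n, pairs):
    """Group rectangles 0..n-1 into components, given pairs known to be connected."""
    adjacent = defaultdict(list)
    for a, b in pairs:
        adjacent[a].append(b)
        adjacent[b].append(a)

    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first traversal from an unvisited rectangle.
        seen[start] = True
        stack = [start]
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in adjacent[v]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
        components.append(sorted(component))
    return components
```

The traversal takes time linear in n plus the length of the pair list, so it does not dominate the O(n log n) cost of the sweep.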


The first phase of the algorithm uses two data structures, the priority search
tree [7] and the illuminator tree.


A priority search tree T(a, b, c) can be defined as a binary tree such that
T has a root v with L(v) = a, R(v) = b and P(v) = c; if |R(v) - L(v)| > 1, v
has a leftson T(L(v), ⌊(L(v) + R(v))/2⌋, d) where c < d; and v has a rightson
T(⌊(L(v) + R(v))/2⌋, R(v), d') where c < d'. Thus the T[L(root), R(root)) is a
segment tree as defined in Section 2, and the P(v)'s satisfy the properties of a
priority queue. This priority search tree represents a set of n intervals
(s_1, e_1), ..., (s_n, e_n) where s_i < e_i, and the s_i's are distinct. The values of the
s_i's are used to build a segment tree T(a, b) (a = minimum of the s_i's,
b = maximum of the s_i's + 1), and each interval is associated with a tree node v
such that L(v) ≤ s_i < R(v) and P(v) = e_i. See Figure 3. [6] also discusses how
the constraints of distinct s values may be removed and how to create a


2. Initialize a list L to be empty. In general, L is an ordered list
   (l_1, l_2, ..., l_d) in increasing order. It represents the y-coordinates of all
   the horizontal edges which intersect the vertical sweepline.

3. For s := 1 to m do
   begin
      let y_1 < y_2 < ... < y_k be the y-coordinates of the vertices
         with x-coordinate u_s;
      for i := 1 to k do
         if y_i is already in L, i.e., l_j = y_i for some j
         then delete l_j from L, since this signifies the righthand end
              of a horizontal edge
         else insert y_i into L;
      after all the y_i's are properly inserted, L = (l_1, ..., l_2t);
      for i := 1 to 2t by step 2
         output rectangle [l_i, l_{i+1}] × [u_s, u_{s+1}], i.e., rectangles with vertical
         edges l_i to l_{i+1} and horizontal edges u_s to u_{s+1}
   end.

The above algorithm uses a vertical sweep line going from left to right. The
sweep line stops at each vertical edge, finds all the horizontal edges it crosses and
outputs rectangles of width = next stop position - current stop position. If there
are n vertices on the contours, the time complexity of the algorithm is
O(n log n + nh), where h is the maximum number of horizontal edges any
vertical line crosses; h can be as large as n in the worst case. The number of
rectangles produced is also O(nh). Figure 9 shows that a long rectangle could
be cut (unnecessarily) into many small rectangles by this algorithm, since all
rectangles output are of width u_{s+1} - u_s. Also, at each u_s, the entire list L is
examined.
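
The simple sweep above can be sketched in Python as follows. This is an illustration under our own assumptions: the input is just the set of contour vertices (x, y), every vertex is the endpoint of exactly one horizontal edge, a plain sorted list plays the role of L, and the function name and the output tuple (x0, x1, y0, y1) are hypothetical.

```python
from bisect import bisect_left
from itertools import groupby


def slabs_from_vertices(vertices):
    """Cut the region bounded by rectilinear contours into vertical slabs.

    vertices: iterable of (x, y) contour corners.
    Returns rectangles as (x0, x1, y0, y1), one per slab piece.
    """
    points = sorted(set(vertices))
    columns = [(x, [y for _, y in group])
               for x, group in groupby(points, key=lambda p: p[0])]

    L = []  # sorted y-coordinates of horizontal edges crossing the sweep line
    rectangles = []
    for s, (x, ys) in enumerate(columns):
        for y in ys:
            k = bisect_left(L, y)
            if k < len(L) and L[k] == y:
                L.pop(k)        # righthand end of a horizontal edge
            else:
                L.insert(k, y)  # lefthand end of a horizontal edge
        if s + 1 < len(columns):
            x_next = columns[s + 1][0]
            # Consecutive pairs in L bound the interior between two stops.
            for k in range(0, len(L) - 1, 2):
                rectangles.append((x, x_next, L[k], L[k + 1]))
    return rectangles
```

For an L-shaped contour with corners (0,0), (2,0), (2,1), (1,1), (1,2), (0,2), the sketch produces two slabs, one for each interval between consecutive stop positions.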

The following algorithm uses the same basic principle as the one above, but
an element of L is examined only if it is being deleted (the righthand end of a
horizontal edge) or if some new horizontal edges are being inserted next to it. A
rectangle is output when it can no longer be extended to the right because of the
presence of a corner or some other vertical edge. In this algorithm, the endpoints
of a horizontal edge need to know the direction of the edge (→ or ←, using the
convention that clockwise is outer boundary, counterclockwise is a hole) and each
l_i of L must also record the direction and a value X-tag which indicates the
leftmost position of the edge which has not been included in any of the rectangles
output so far.

1. Each vertex v = (x, y) of the boundary determines dir(x, y), the direction
   of the horizontal edge at the vertex (its value is either → or ←).

2. The vertices are sorted in increasing (x, y) order. Let
   u_1 < u_2 < ... < u_m be the distinct x-coordinates. For each
   u_s (1 ≤ s ≤ m), let y_{s,1} < y_{s,2} < ... < y_{s,k_s} be the y-coordinates of the
   vertices with x-coordinate u_s.

3. L is a list (l_1, l_2, ..., l_last) such that l_i = (Y-val, D, X-tag). L is
   initialized to ((y_{1,1}, dir(u_1, y_{1,1}), u_1), ..., (y_{1,k_1}, dir(u_1, y_{1,k_1}), u_1)).

4. For s := 2 to m do
   (* At each sweepline position, examine each corner of the boundary
      having this x value *)
   for t := 1 to k_s do
      if no element in L has Y-val = y_{s,t}
      then (* the beginning of a new horizontal edge *)
         begin
            let a = (y_{s,t}, dir(u_s, y_{s,t}), u_s);
            if a is to be inserted between l_i and l_{i+1} for some
               1 ≤ i < last
            then (* a is not new first element or last element of L *)
                 (* look at l_i and l_{i+1} *)
               begin
                  if l_i.D = ← and l_{i+1}.D = → and l_i.X-tag
                     ≤ l_{i+1}.X-tag < u_s then output [l_i.Y-val,
                     l_{i+1}.Y-val] × [l_i.X-tag, u_s],
                  and set l_i.X-tag and l_{i+1}.X-tag to u_s
               end;
            insert a in L
         end
      else (* an element l_j in L has Y-val = y_{s,t}, i.e., the end of a
              horizontal edge in L *)
         begin
            suppose l_j.Y-val = y_{s,t}; then we need to look at
            l_{j-1} or l_{j+1} to determine the rectangle to output;
            if dir(u_s, y_{s,t}) = →
            then output [l_{j-1}.Y-val, y_{s,t}] × [l_j.X-tag, u_s]
                 and set l_{j-1}.X-tag to u_s
            else (* dir(u_s, y_{s,t}) = ← *)
                 output [y_{s,t}, l_{j+1}.Y-val] × [l_j.X-tag, u_s]
                 and set l_{j+1}.X-tag to u_s;
            delete l_j from L
         end

Using this algorithm, the middle rectangle of the E in Figure 9 will not be
cut up into small pieces. The number of rectangles reported is O(n), where n is
the number of corners in the boundary, since every rectangle has at least one
boundary vertex on one of its edges. The time complexity of this algorithm is
O(n log n + nt), where t is the time for inserting or deleting an element of L.
If L is implemented as a balanced binary search tree, insertion and deletion take
O(log h) time, where h ≤ n is the maximum number of edges a vertical sweepline
can cross. We can also doubly link the elements of L in increasing order to allow
easy access to the immediate neighbors. Therefore the time complexity of this
algorithm is O(n log n), since h ≤ n.

7.2.  Run  length  codes 

Given a set of rectangles we can obtain its run length code by reporting the
active segments on each row as runs of 1s and the inactive segments as runs of
0s. We assume that the vertices of the n rectangles are sorted on the key (y, x).
We also assume that the rectangles are all contained within a bounding frame
starting at x-coordinate M and extending up to x-coordinate N. The vertical
dimension of the frame may be taken to be the vertical span of the set of
rectangles itself. We pass a horizontal sweep line from top to bottom.


Let L be a sorted list of nodes, each representing a horizontal segment.
Each node has fields LEFT and RIGHT for the left and right ends of the
segments, and a COUNT field to represent the number of times a given portion of
the x-axis was covered by segments. LINK is the pointer to the next node on L.

Let y_1 > y_2 > ... > y_k be the list of distinct y-coordinates.
for i := 1 to k do
begin
   let l_1, l_2, ..., l_j be the horizontal segments with y-coordinate y_i.
   for a := 1 to j do
   begin
      if l_a is the top end of some rectangle then
      begin COUNT1 ← 0; P ← start of L;
         output a run of 0s of length LEFT(P) - M + 1;
         repeat
            COUNT1 ← COUNT1 + RIGHT(P) - LEFT(P) + 1;
            if RIGHT(P) = LEFT(LINK(P)) then (* continue run *)
               P ← LINK(P)
            else begin output a run of 1s of length COUNT1;
               if LINK(P) = nil then output a run of 0s
                  of length N - RIGHT(P) + 1
               else output a run of 0s of length
                  LEFT(LINK(P)) - RIGHT(P) + 1;
               P ← LINK(P); COUNT1 ← 0
            end
         until the proper place is found for l_a on the sorted list L
         (i.e., l_a lies between two nodes of L).

         One of the following cases must hold:

         Case 1: l_a is covered entirely by some segments on L:
                 Split those segments into parts covering l_a and
                 parts not covering l_a. Increment the COUNT
                 field of the first nodes by 1.

         Case 2: l_a is partially covered: Split the segments into
                 parts covering l_a and parts not covering l_a. Insert
                 that part of l_a not covered into L and increment
                 the COUNT fields of the covered parts.

         Case 3: l_a is not covered: Insert l_a with a COUNT of 1.

      end (* if l_a is top end *)
      else (* l_a is a bottom end *)
      begin
         Scan the list L as before, in the if part, outputting
         runs of 0s and 1s, but decrement the
         COUNT fields of all segments contained in l_a by
         1. Delete those segments whose COUNT becomes 0.
      end
   end (* for each a *)
end (* for each i *)

In this way, we get the run length code of all those rows y_i, 1 ≤ i ≤ k, on
which a horizontal line of some rectangle is incident. There are O(n) stopping
points of the horizontal sweep, and the list L can contain O(n) nodes, because
there can be O(n) distinct vertical coordinates. The time complexity is O(n^2).
The run length code of any other row is the same as that of the preceding row.
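
To make the output format concrete, the following Python sketch computes the run length code of a single row directly from the rectangles. It is not the incremental list-based sweep described above: it rescans all rectangles for each row, which is simpler but slower. The function name, the tuple layout (x0, x1, y0, y1) with inclusive pixel bounds, and the (value, length) run format are our own assumptions, and all rectangles are assumed to lie within columns M..N.

```python
def run_length_row(rectangles, y, M, N):
    """Run length code of row y over columns M..N as (value, length) pairs.

    rectangles: iterable of (x0, x1, y0, y1), inclusive pixel bounds.
    """
    spans = sorted((x0, x1) for x0, x1, y0, y1 in rectangles if y0 <= y <= y1)
    runs = []
    pos = M  # first column not yet emitted
    for x0, x1 in spans:
        if x1 < pos:
            continue  # span already covered by an earlier one
        start = max(x0, pos)
        if start > pos:
            runs.append((0, start - pos))  # gap before this span
        if runs and runs[-1][0] == 1:
            # Overlapping or abutting span: extend the current run of 1s.
            runs[-1] = (1, runs[-1][1] + x1 - start + 1)
        else:
            runs.append((1, x1 - start + 1))
        pos = x1 + 1
    if pos <= N:
        runs.append((0, N - pos + 1))  # trailing run of 0s
    return runs
```

Calling this once per stopping row y_i yields the same code as the sweep, since rows between stopping points repeat the code of the preceding row.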

To convert run length code representation to union-of-rectangles, we
maintain a list L of nodes representing rectangles. Each node has a “rowspread” field
and a “columnspread” field. The rowspread field indicates that the rectangle
specified by that node spans the two rows specified in this field. The
columnspread field indicates that the rectangle specified by that node spans the
two columns specified in this field.

Initially, L is empty.

For each row of the run length code do:

   Let the run be a_1 0s, a_2 1s, a_3 0s, a_4 1s, ..., a_{2n-1} 0s, a_{2n} 1s,
   a_{2n+1} 0s.

   Let current position on list L be the start of list L.

   For each run of 1s in this row do:

      Scan through L starting from the current position until a node l is
      found whose columnspread intersects the columnspread of the current
      run of 1s. Output rectangles corresponding to all nodes before l.

      Let columnspread(l) = c_1, c_2.
      Let columnspread of current run of 1s = d_1, d_2.

      case 1: c_1 = d_1, c_2 = d_2: increase rowspread of l.

      case 2: c_1 = d_1, c_2 > d_2: split l by increasing its rowspread by 1
              and making the columnspread run till d_2.
              Output a rectangle with rowspread equal
              to that of l previously and columnspread
              from d_2 to c_2.

      case 3: c_1 = d_1, d_2 > c_2: increment rowspread of l and append a
              new node after l, whose columnspread is
              from c_2 to the end of the current run of
              1s.

      case 4: c_1 < d_1, c_2 = d_2: split l by incrementing its rowspread, and
              changing its columnspread to start from
              d_1. Output a rectangle with rowspread the
              same as the old rowspread of l and
              columnspread from c_1 to d_1.

      case 5: c_1 < d_1 < c_2 < d_2: split l by incrementing its rowspread,
              and changing its columnspread to
              d_1 to c_2. Output a rectangle with the old
              rowspread of l and columnspread from
              c_1 to d_1. Append a node after l with
              columnspread from c_2 to d_2.

      case 6: d_1 < c_1 < d_2 < c_2: split l by incrementing the rowspread,
              and changing its columnspread to
              c_1 to d_2. Output a rectangle with old
              rowspread of l and columnspread from
              d_2 to c_2. Append a node before l with
              columnspread d_1 to c_1.

      case 7: c_1 < d_1 < d_2 < c_2: split l by incrementing its rowspread,
              and changing the columnspread to
              d_1 to d_2. Output two rectangles with
              rowspread equal to the previous value
              of l and columnspread from c_1 to d_1, and
              d_2 to c_2.

      case 8: d_1 < c_1 < c_2 < d_2: split l by incrementing its rowspread,
              and appending two nodes, one before and
              one after l, with columnspreads d_1 to c_1,
              and c_2 to d_2.
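
The bookkeeping of rowspread and columnspread can be illustrated with a deliberately simplified Python sketch. It implements only case 1 (a run with exactly the same columnspread extends the open rectangle); in every other case it closes the old rectangle and opens a new one instead of splitting, so it may report more rectangles than the eight-case procedure above. The function name and the formats of input and output are our own assumptions.

```python
def rle_rows_to_rectangles(rows):
    """Convert run length coded rows into rectangles (simplified variant).

    rows: list of rows, each a list of (value, length) runs starting at column 0.
    Returns sorted rectangles (row0, row1, col0, col1) with inclusive bounds.
    """
    open_rects = {}  # columnspread (col0, col1) -> first row of the rectangle
    output = []
    for r, runs in enumerate(rows):
        spans = set()
        col = 0
        for value, length in runs:
            if value == 1 and length > 0:
                spans.add((col, col + length - 1))
            col += length
        # Close every open rectangle whose columnspread does not recur exactly.
        for span in [s for s in open_rects if s not in spans]:
            output.append((open_rects.pop(span), r - 1, span[0], span[1]))
        # Case 1 extends implicitly; any other run of 1s opens a new rectangle.
        for span in spans:
            open_rects.setdefault(span, r)
    last_row = len(rows) - 1
    for span, first_row in open_rects.items():
        output.append((first_row, last_row, span[0], span[1]))
    return sorted(output)
```

The union of the reported rectangles equals the set of 1 pixels; the splitting cases 2 to 8 exist to keep long rectangles from being closed early when only part of their columnspread continues.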

The rectangles we output may be degenerate ones, i.e., a single point or a
single line. This may be avoided by increasing the resolution so that single pixels
become rectangles with non-empty interior. The number of rectangles is bounded
by the number of corners in the contour of the region, which is at most the number of
runs of 1s. In the worst case, the time complexity is O(image size), when the
image is a checkerboard.

8. Concluding Remarks


The algorithms in this note all use the line sweep method and the segment
tree or its variants as the supporting data structure. They are time optimal
algorithms: O(n log n + p) for the contour of a union of n rectilinear rectangles,
where p is the number of contour pieces; O(n log n) for the connected
components, area, centroid and moments.

These time and space efficient algorithms can be useful in image processing
because the medial axis transform of a region is a set of rectilinear squares. They
show that geometric properties can be obtained from the MAT quite efficiently.
Moreover, regions can often be covered by far fewer rectangles [10] than the
number of squares in a MAT. Thus a set of rectangles is a useful compact
representation of regions.

The problems we considered in Sections 3-5 can also be solved using divide
and conquer methods [11]. These algorithms divide the plane into frames which
are vertical strips between two vertical lines (as defined by the vertical edges of
rectangles). Then the problem is solved for each of the frames. Some
information which allows the merging of the solutions is also calculated. Finally the
solutions for the subproblems are merged. The performance of the divide and
conquer algorithms matches that of the line sweeping method.

A set of rectilinear rectangles can be obtained from other representations of
regions and from performing set-theoretic operations on given MATs.